Triadic closure dynamics explains scaling-exponents for preferential attachment-, 
degree- and clustering distributions in social multiplex data 

Social networks exhibit scaling-laws for several structural characteristics, such as the degree
distribution, the scaling of the attachment kernel, and the clustering coefficients as a function
of node degree. A detailed understanding of whether and how these scaling laws are inter-related
is missing so far, let alone whether they can be understood through a common, dynamical principle.
We propose a simple model for stationary network formation and show that the three mentioned
scaling relations follow as natural consequences of triadic closure. The validity of the model is
tested on multiplex data from a well studied massive multiplayer online game. We find that the
three scaling exponents observed in the multiplex data for the friendship, communication and
trading networks can simultaneously be explained by the model. These results suggest that triadic
closure could be identified as one of the fundamental dynamical principles in social multiplex
network formation.

Social networks often exhibit statistical structures that manifest themselves in scaling-laws
which can be quantified through a set of characteristic exponents. Perhaps the three most
relevant scaling laws in terms of network formation are the linking probability for new nodes
joining the network as a function of the degree of the existing (linked-to) node, the degree
distribution, and the clustering coefficient of nodes as a function of their degree. In
particular, the probability for a node to acquire a new link, the attachment kernel $\Pi(k)$,
often scales with the node degree k [1, 2] as



$$\Pi(k) \propto k^{\gamma} . \tag{1}$$



The degree distribution of social networks, i.e. the probability to find a node with a given
degree k, P(k), often shows features of exponential or fat-tailed distributions [?, 3], or
something in between, depending on the type of social interaction [4, 5]. They can be
parameterized conveniently by the q-exponential [?],



$$P(k) \propto \left(1 - (1-q)\,k\right)^{\frac{1}{1-q}} , \tag{2}$$



with q a parameter that determines an asymptotic scaling exponent $1/(1-q)$. A third scaling
law, which is ubiquitous in social networks [?, 10], is observed for the clustering
coefficients c(k) as a function of node degree,



$$c(k) \propto k^{-\beta} . \tag{3}$$
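
As a minimal numerical sketch (not part of the original analysis), the q-exponential can be evaluated directly to check its limiting behaviors: the ordinary exponential for q = 1, and a power-law tail with exponent $1/(1-q)$ for q > 1. The function names are illustrative, and the decaying convention with unit scale, P(k) proportional to e_q(-k), is an assumption made here.

```python
import math


def q_exponential(x: float, q: float) -> float:
    """Tsallis q-exponential e_q(x) = [1 + (1 - q) x]^(1 / (1 - q))."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)  # ordinary exponential in the limit q -> 1
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        return 0.0  # cut-off; for x <= 0 this is only reached when q < 1
    return base ** (1.0 / (1.0 - q))


def degree_weight(k: float, q: float) -> float:
    """Unnormalized P(k), assumed here to be e_q(-k) so that it decays with k."""
    return q_exponential(-k, q)


def local_slope(k: float, q: float, h: float = 1e-4) -> float:
    """Numerical estimate of d ln P / d ln k at degree k."""
    upper = math.log(degree_weight(k * (1.0 + h), q))
    lower = math.log(degree_weight(k * (1.0 - h), q))
    return (upper - lower) / (math.log(1.0 + h) - math.log(1.0 - h))


if __name__ == "__main__":
    for q in (1.1, 1.25):
        # The local log-log slope at large k is compared with 1 / (1 - q).
        print(q, local_slope(1e6, q), 1.0 / (1.0 - q))
```

For q > 1 the local slope at large k approaches $1/(1-q)$, which is the asymptotic scaling exponent referred to in the text.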



Despite the overwhelming empirical evidence for the scaling laws in Eqs. (1)-(3), it is still
undecided whether they share a common dynamical origin, and whether and how the characteristic
exponents are related to each other. For example, for growing network models, where new nodes
are constantly added and link through a preferential attachment rule to already existing nodes
[3], a relation between the scaling exponents of the degree distribution and of the attachment
kernel, $\gamma$, has been found [11]. However, these models cannot explain the observed
scaling of the clustering coefficients. Moreover, the preferential attachment process [?]
requires global information (the degrees of all nodes in the network) to establish a new social
tie, which is clearly an unrealistic assumption for most social networks. To overcome this
problem, growth and preferential attachment mechanisms have been extended by local network
formation rules [12-15], where a node's linking dynamics only depends on its neighbors or
second neighbors. One such local rule which is extremely relevant for social network formation
is the principle of triadic closure [16, 17], which means that the probability of a new link
closing a triad is higher than the probability of connecting any two nodes. Scaling-laws for
the degree distribution [13], degree distribution and clustering coefficients [14], and
preferential attachment [15] have been reproduced in the context of specific models using
triadic closure. While it is instructive to see how a combination of growth, preferential
attachment and clustering processes gives rise to the three scaling laws above, this does not
help us to understand whether the existence and possible inter-relations of the three exponents
can emerge from a single underlying dynamical origin, and to what extent this common origin is
an actual feature of real social network formation processes. Less is known about relations
between characteristic exponents in non-growing, stationary networks [7, 18]. It has been shown
that triadic closure is related to scaling-laws for the degree distribution and clustering
coefficients in the stationary case.



Here we present a simple model that simultaneously explains the three scaling laws in
Eqs. (1)-(3) based on the process of triadic closure in non-growing networks. It introduces a
mechanism from which preferential attachment emerges, leads to fat-tailed degree distributions,
and induces scaling of the clustering coefficients with node degrees. The model is validated
with data from a social multiplex, i.e. a superposition of several social networks labeled by
$\alpha$ with adjacency matrices $M^{\alpha}$, defined on the same set of nodes [22]. The model
can be fully calibrated with the multiplex data and explains three observed characteristic
exponents for three different sub-networks of the multiplex.

FIG. 1. Node i (with more than two links) and one of its neighbors j are randomly selected.
With probability r the process of triadic closure takes place (triad consists of i, j, k),
with probability $1 - r$, j links to a random node.

The model is built around the process of triadic closure, the principle that links tend to be
created between nodes that share a neighbor. The model includes the addition and removal of
nodes. The network is initialized with N nodes, each node having one link to a randomly chosen
node. The dynamics is completely specified by an iteration of the following steps, starting at
time-step t.

1. Pick a node i at random. If i has fewer than two links, create a link between i and any
randomly chosen node, and continue with step 3. If i has two or more links, choose one of its
neighbors at random, say node j, and continue with step 2.

2. With probability r (triadic closure parameter), create a link between j and another
randomly chosen neighbor of i, say k. With probability $1 - r$, create a link between j and a
node randomly chosen from the entire network, see Fig. 1.

3. With probability p (node-turnover parameter) remove a randomly chosen node from the network
along with all its links, and introduce a new node linking to m randomly chosen nodes. Then
continue with time-step t + 1.
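
The three steps above can be sketched in a few lines of Python. This is an illustrative reading of the protocol rather than the code used for the results reported here. The function name `simulate`, the default parameter values, the reuse of a removed node's label for the newly introduced node, and the choice to ignore self-links and duplicate links are all assumptions.

```python
import random


def simulate(n_nodes=200, r=0.6, p=0.1, m=1, steps=20000, seed=0):
    """Iterate steps 1-3 and return the network as a dict of neighbor sets."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n_nodes)}

    def add_link(a, b):
        # Assumption: simple graph, so self-links and duplicates are ignored.
        if a != b:
            adj[a].add(b)
            adj[b].add(a)

    def random_other(a):
        # Uniformly random node different from a.
        b = rng.randrange(n_nodes - 1)
        return b if b < a else b + 1

    # Initialization: every node gets one link to a randomly chosen node.
    for i in range(n_nodes):
        add_link(i, random_other(i))

    for _ in range(steps):
        i = rng.randrange(n_nodes)
        if len(adj[i]) < 2:
            # Step 1: i has fewer than two links and links to a random node.
            add_link(i, random_other(i))
        else:
            j = rng.choice(sorted(adj[i]))
            if rng.random() < r:
                # Step 2, triadic closure: j links to another neighbor k of i.
                k = rng.choice(sorted(adj[i] - {j}))
                add_link(j, k)
            else:
                # Step 2, random link: j links to any node of the network.
                add_link(j, random_other(j))
        if rng.random() < p:
            # Step 3: remove a random node together with all its links ...
            old = rng.randrange(n_nodes)
            for nb in adj[old]:
                adj[nb].discard(old)
            adj[old] = set()
            # ... and reuse its label for a new node with m random links.
            others = [x for x in range(n_nodes) if x != old]
            for target in rng.sample(others, m):
                add_link(old, target)
    return adj


if __name__ == "__main__":
    net = simulate()
    degrees = [len(nbrs) for nbrs in net.values()]
    print("mean degree:", sum(degrees) / len(degrees))
```

Because the number of nodes is kept fixed and every removed node is replaced, the sketch has no net growth, so network measures can be averaged over late time-steps or over independent seeds.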

For p > 0 nodes have a finite lifetime, which implies that the network reaches a stationary
state where the total number of links L(t) and the network measures $\Pi(k)$, P(k), and c(k)
fluctuate around steady state levels. The model is a variant of the model proposed in [19],
which is contained as the special case r = 1 in the above protocol. Our model can also be seen
as a stationary version of the connecting-nearest-neighbors-model in [14]. Reaching a
stationary state is independent of m. The model is completely specified by four parameters,
N, r, p, and m.

Estimation of model parameters. Social ties are often established between two individuals by
being introduced by a mutual acquaintance. Other modes of social tie formation, such as random
encounters, may not lead to triadic closure. Step 2 in the above protocol captures these two
linking processes. Ties also change because people enter and leave social circles, for example
they change workplaces, move to different cities, or change their hobbies. This is
incorporated in step 3. To calibrate the model to a real social multiplex network,
$M^{\alpha}$ with $N_{\alpha}$ nodes and $L_{\alpha}$ links, the stationarity assumption has
to be checked, and the parameters for triadic closure, r, and node-turnover, p, have to be
estimated. Consider the average number of nodes entering ($\Delta n^{+}$) and leaving
($\Delta n^{-}$) the network $M^{\alpha}$ per time unit. For stationarity to hold we demand



$$\Delta n^{+},\ \Delta n^{-} \gg \Delta n^{+} - \Delta n^{-} , \tag{4}$$



i.e. the net growth rate is much smaller than the rates at which nodes enter or leave the
network. The triadic closure parameter $r_{\alpha}$ can be directly measured as the ratio
between the number of links in network $M^{\alpha}$ which, at their creation, close at least
one triangle, and the total number of created links. The node-turnover parameter p can be
estimated by demanding that the number of links in the model and in the real network be the
same. To see this, note that one adds on average $\Delta l^{+}$ and removes $\Delta l^{-}$
links per time-step. Stationarity means that $\Delta l^{+} = \Delta l^{-}$. Since one link is
created at each time-step in either step 1 or 2, and with probability p, m links are added in
step 3, we have $\Delta l^{+} = 1 + pm$. Denoting the average degree by $\bar{k} = 2L/N$, with
probability p, in step 3, one removes on average $\bar{k}$ links per time-step,
$\Delta l^{-} = p\bar{k}$. To calibrate the model to a network $M^{\alpha}$ the turnover
parameter $p_{\alpha}$ is


$$p_{\alpha} = \frac{1}{\bar{k}_{\alpha} - m} . \tag{5}$$



The model is initialized with $N_{\alpha}$ nodes and the dynamics follows the protocol with
parameters $r_{\alpha}$ and $p_{\alpha}$.
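
The calibration can be illustrated with a short Python sketch. The function names, the event format (a time-ordered list of node pairs), the factor used to test "much larger", and the value m = 1 in the usage example are assumptions made for illustration; link removals are not tracked when counting closed triangles.

```python
def mean_degree(n_nodes, n_links):
    """Average degree 2L/N of an undirected network."""
    return 2.0 * n_links / n_nodes


def turnover_parameter(n_nodes, n_links, m):
    """Solve the stationarity condition 1 + p*m = p*k_bar for p."""
    k_bar = mean_degree(n_nodes, n_links)
    if k_bar <= m:
        raise ValueError("average degree must exceed m")
    return 1.0 / (k_bar - m)


def triadic_closure_fraction(link_events):
    """Fraction of newly created links that close at least one triangle.

    link_events is a time-ordered iterable of (u, v) pairs (assumed format).
    """
    adj = {}
    closing = total = 0
    for u, v in link_events:
        nu = adj.setdefault(u, set())
        nv = adj.setdefault(v, set())
        if u == v or v in nu:
            continue  # skip self-links and repeated links
        total += 1
        if nu & nv:  # a shared neighbor exists, so the link closes a triangle
            closing += 1
        nu.add(v)
        nv.add(u)
    return closing / total if total else float("nan")


def is_stationary(entering, leaving, factor=10.0):
    """Heuristic for the stationarity condition: turnover >> net growth."""
    return min(entering, leaving) >= factor * abs(entering - leaving)


if __name__ == "__main__":
    # Node and link counts of the three networks; m = 1 is an assumption.
    for name, n, l in [("friends", 4547, 21622),
                       ("communication", 2810, 9420),
                       ("trade", 4514, 31475)]:
        print(name, round(mean_degree(n, l), 1),
              round(turnover_parameter(n, l, m=1), 2))
```

With the assumed m = 1, the rounded turnover parameters obtained from the node and link counts of the three networks are 0.12, 0.18 and 0.08.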

Multiplex data. The calibration requires complete, time-resolved topological information
$M^{\alpha}(t)$ over a large number of link-creation processes. Suitable data is available,
for example, in the social multiplex network of the online game Pardus [?]. This unique
dataset allows one to continuously track all actions of more than 370,000 players in an
open-ended, virtual, futuristic game universe where players interact in a multitude of ways to
achieve their self-posed goals, such as accumulating wealth and influence. Players can
establish friendship links, exchange one-to-one messages (similar to phone calls) and trade
with each other. We focus on three sub-networks (friendship, communication, trade) of the
multiplex, over one year from Sep 2007 to Sep 2008. Network label $\alpha = 1$ refers to the
friendship network, $\alpha = 2$ to communication, and $\alpha = 3$ to trade. In the
friendship network a node is present on a given day if at least one friendship link to another
node exists on that day. A node is removed if the player either leaves the game or has no
friendship link. The same holds for the message and trade networks, where a link exists
between two nodes on day t if at least one message (trade) is exchanged within the period of
six days, $[t - 6, t]$.

FIG. 2. Dependence of scaling exponents $\gamma$, q, and $\beta$ on the model parameters p and
r. (a) $\gamma$ becomes closer to one for high p or r, and is confined to the interval
$0 < \gamma < 1$. (b) q is large for small p and large r, and approaches one for large p.
(c) $\beta$ is close to zero for r close to zero, and approaches $\beta = -1$ for large values
of p and r.

TABLE I. Summary of network measures and model results. For the Pardus friendship
($\alpha = 1$), communication ($\alpha = 2$), and trade ($\alpha = 3$) networks the number of
nodes $N_{\alpha}$, links $L_{\alpha}$, average degree $\bar{k}_{\alpha}$, and average number
of nodes entering and leaving the network per day, $\Delta n^{+}$ and $\Delta n^{-}$, are
shown. The results of the calibration of the model to the empirical networks, r and p, are
given, together with the fit results of the parameters $\gamma$, q, and $\beta$ for the data
and the model.

| network | $\alpha$ | $N_{\alpha}$ | $L_{\alpha}$ | $\bar{k}_{\alpha}$ | $\Delta n^{+}$ | $\Delta n^{-}$ | $r_{\alpha}$ | $p_{\alpha}$ | $\gamma$ | $\gamma_{\mathrm{mod}}$ | q | $q_{\mathrm{mod}}$ | $\beta$ | $\beta_{\mathrm{mod}}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| friends | 1 | 4,547 | 21,622 | 9.5 | 24.26 | 23.07 | 0.58 | 0.12 | 0.84 | 0.74 | 1.25 | 1.22 | 0.74 | 0.74 |
| communication | 2 | 2,810 | 9,420 | 6.7 | 110.2 | 109.4 | 0.57 | 0.18 | 0.86 | 0.78 | 1.29 | 1.14 | 0.59 | 0.73 |
| trade | 3 | 4,514 | 31,475 | 13.9 | 58.58 | 56.19 | 0.80 | 0.08 | 0.89 | 0.83 | 1.1 | 1.23 | 0.66 | 0.68 |

For details of structural and dynamical properties of the Pardus multiplex, see [?]. Table I
summarizes key features of $M^{\alpha}$, including the number of nodes $N_{\alpha}$, links
$L_{\alpha}$, and average degree $\bar{k}_{\alpha}$, as measured on the last day of the
observation record. Table I also contains the average number of nodes entering
($\Delta n^{+}$) and leaving ($\Delta n^{-}$) per day, confirming that the networks are in
fact stationary in the sense of Eq. (4). Estimates for r and p are also listed in Table I. To
measure the degree distributions $P_{\alpha}(k_{\alpha})$ and clustering coefficients
$c_{\alpha}(k_{\alpha})$, we use the adjacency matrix of the networks $M^{\alpha}$ on the last
day of the data record. The preferential attachment probability $\Pi_{\alpha}(k_{\alpha})$ is
measured by counting (over the entire observation period) the number of link-creation events
in which a node with degree k acquires a new link, and then dividing this by the average
number of nodes with degree k, where the average is again taken over the observation period.
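
The counting procedure for the attachment kernel can be sketched as follows. The data layout (a list of days, each holding the links created on that day), the function name, and the use of one degree snapshot per day for the averaging are assumptions of this illustration; link and node removals are ignored for brevity.

```python
from collections import Counter, defaultdict


def attachment_kernel(days, nodes):
    """Unnormalized attachment kernel from time-ordered link creations.

    days  : list of lists of (u, v) link-creation events, one list per day
    nodes : iterable of all node labels
    Returns {k: events at degree k / average number of nodes with degree k}.
    """
    degree = {n: 0 for n in nodes}
    neighbors = defaultdict(set)
    events_at_k = Counter()  # link-creation events involving a degree-k node
    nodes_at_k = Counter()   # degree-k nodes, summed over daily snapshots

    for day in days:
        nodes_at_k.update(degree.values())  # snapshot at the start of the day
        for u, v in day:
            if u == v or v in neighbors[u]:
                continue  # ignore self-links and repeated links
            # Both endpoints acquire a link at their current degree.
            events_at_k[degree[u]] += 1
            events_at_k[degree[v]] += 1
            neighbors[u].add(v)
            neighbors[v].add(u)
            degree[u] += 1
            degree[v] += 1

    n_days = len(days)
    # Degrees never seen in a daily snapshot are skipped.
    return {k: events_at_k[k] / (nodes_at_k[k] / n_days)
            for k in events_at_k if nodes_at_k[k] > 0}


if __name__ == "__main__":
    print(attachment_kernel([[(0, 1)], [(0, 2)]], nodes=[0, 1, 2]))
```

The returned values are unnormalized; dividing by their sum over a chosen degree range gives a kernel that can be compared between data and model.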

Simulation results for the values of the characteristic exponents $\gamma$, q, and $\beta$ in
the model depend on the parameters p and r, as shown in Fig. 2. We fix $N = 10^{3}$ and
m = 0. Results are averaged over 500 realizations for each parameter pair (p, r). All three
scaling exponents (Eqs. (1)-(3)) can be explained by the model.

Model exponents for $\gamma$ fall in the range $0 < \gamma < 1$, depending on p and r,
Fig. 2(a). $\gamma$ is close to one for high p and high r. The preferential attachment
associated with triadic closure is therefore sub-linear. The dependence of the exponent q on
both p and r is shown in Fig. 2(b). Note that for q = 1 the q-exponential is equivalent to the
exponential. Values of q above (below) one indicate that the distribution decays slower
(faster) than the exponential. For small p and large r, q is significantly larger than one and
degree distributions are fat-tailed. For large p the values of q approach one, independent of
r. Values for $\beta$ are close to zero for r = 0 or p going to 0. $\beta$ approaches a
plateau at $\beta = -1$ for high values of p and r, see Fig. 2(c).

For the experimental validation of the model, Fig. 3 shows the attachment kernel
$\Pi_{\alpha}(k_{\alpha})$, degree distribution $P_{\alpha}(k_{\alpha})$, and clustering
coefficients $c_{\alpha}(k_{\alpha})$ for the three sub-networks $M^{\alpha}$ of the empirical
multiplex data. They are compared to the respective distributions of the calibrated model
(results averaged over 20 realizations). Power-law fits (least-squares) are shown for $\gamma$
over the range $2 < k_{\alpha} < 100$, and for $\beta$ over the range $5 < k_{\alpha} < 100$,
for each $\alpha$, for data and model. Degree distributions are fitted (least-squares) over
the entire range $k_{\alpha} > 0$ in Fig. 3 with Eq. (2). For better comparison and to
diminish the effect of outliers, data and model results for $\Pi_{\alpha}(k_{\alpha})$ are
normalized over the range $k_{\alpha} < 100$. Higher values correspond to data outliers, often
due to behavior of non-serious players.
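
A least-squares power-law fit over a restricted degree range can be sketched as below. The text specifies only that least-squares fits are used; performing the regression on logarithmic axes, as well as the function name and signature, are assumptions of this example.

```python
import math


def fit_power_law(ks, ys, k_min, k_max):
    """Least-squares line through (ln k, ln y) for k_min < k < k_max.

    Returns (exponent, prefactor) of the fitted form y = prefactor * k**exponent.
    """
    pts = [(math.log(k), math.log(y))
           for k, y in zip(ks, ys)
           if k_min < k < k_max and y > 0]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two points inside the fit range")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    return slope, math.exp(mean_y - slope * mean_x)


if __name__ == "__main__":
    ks = list(range(1, 200))
    # Synthetic kernel with a known exponent, fitted over 2 < k < 100.
    gamma, _ = fit_power_law(ks, [3.0 * k ** 0.84 for k in ks], 2, 100)
    # Synthetic clustering curve; beta is minus the fitted slope.
    slope, _ = fit_power_law(ks, [k ** -0.74 for k in ks], 5, 100)
    print(gamma, -slope)
```

For the clustering coefficients the fitted slope is negative, so the exponent of Eq. (3) is read off as minus the slope.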

The observed preferential attachment in the data is in good agreement with model results for
each network $M^{\alpha}$, see top row of Fig. 3. We find exponents of $\gamma = 0.84$ for the
data and $\gamma_{\mathrm{mod}} = 0.74$ in the model for the friendship network,
$\gamma = 0.86$, $\gamma_{\mathrm{mod}} = 0.78$ for communication, and $\gamma = 0.89$,
$\gamma_{\mathrm{mod}} = 0.83$ for trade. Data and model curves for
$\Pi_{\alpha}(k_{\alpha})$ are barely distinguishable from each other. The degree
distributions are fitted with exponents q = 1.25 and $q_{\mathrm{mod}} = 1.22$ for
$\alpha = 1$, q = 1.29, $q_{\mathrm{mod}} = 1.14$ for $\alpha = 2$, and q = 1.1 and
$q_{\mathrm{mod}} = 1.23$ for $\alpha = 3$. Results are shown in the middle row in Fig. 3.
Data and model show similar scaling of the average clustering coefficient of nodes
$c_{\alpha}(k_{\alpha})$ as a function of their degree $k_{\alpha}$, see bottom row in Fig. 3.
For friendships ($\alpha = 1$) we find $\beta = 0.74$, for the model
$\beta_{\mathrm{mod}} = 0.74$. For communication ($\alpha = 2$) the data yields
$\beta = 0.59$, the model gives $\beta_{\mathrm{mod}} = 0.73$. For trade ($\alpha = 3$) there
is good agreement between data and model with $\beta = 0.66$ and
$\beta_{\mathrm{mod}} = 0.68$, respectively. Results for the exponents $\gamma$, q, $\beta$
for data and model are summarized in Table I.

We have reported strong evidence that the process of triadic closure may play an even more
fundamental role in social network formation than previously anticipated. Given that all model
parameters can be measured in the data, it is remarkable that three important scaling laws are
simultaneously explained by this simple triadic closure model. Since the exponents $\gamma$,
q, and $\beta$ are sensitive to choices of the model parameters p and r, the agreement between
data and model is even more remarkable.

The Pardus multiplex data contains three other social networks, where links express negative
relationships between players, such as enmity, attacks, and revenge [?]. Triadic closure is
known not to be a good network formation process for negative ties: "the enemy of my enemy is
in general not my enemy" [27]. It was shown that the probability of triadic closure between
three players is one order of magnitude smaller for enmity links than for friendship links in
the Pardus multiplex [?, 23]. The model is therefore not suited to describe network formation
processes of links expressing negative sentiments.

The findings in the current model also compare well to several facts of real-world social
networks. Sub-linear preferential attachment has been reported in scientific collaboration
networks and the actor co-starring network ($\Pi(k) \propto k^{0.79}$ and
$\propto k^{0.81}$, respectively [?]). Degree distributions of many social networks often fall
between exponential and power-law distributions [?, 28], and scaling of the average clustering
coefficients as a function of degree has been observed in the scientific collaboration and
actor networks with values for $c(k) \propto k^{-0.77}$ and $\propto k^{-0.31}$, respectively
(when the same fitting as in Fig. 3 is applied). Mobile phone and communication networks give
$\propto k^{-1}$ [?].

FIG. 3. Network scaling-exponents of the social multiplex can be explained by the calibrated
model. Results are shown for the Pardus friendship ($\alpha = 1$, left column), communication
($\alpha = 2$, middle column), and trade network ($\alpha = 3$, right column). Top row: The
attachment kernels scale sub-linearly with the node degrees in each case for data ($\gamma$)
and model ($\gamma_{\mathrm{mod}}$). Curves for data and model are barely distinguishable from
each other. Middle row: Degree distributions for $\alpha = 1, 2, 3$ and best fits of a
q-exponential, for data (q) and model ($q_{\mathrm{mod}}$). Bottom row: The scaling of the
average clustering coefficients as a function of degree is very similar in data and model.
Fits for $\beta$ and $\beta_{\mathrm{mod}}$ yield almost the same results for friends and
trades, with comparably larger deviations for the communication network.



In the Pardus dataset players are removed if they choose to leave the game or if they are
inactive for some time [23]. In the mobile communication, actor, and collaboration networks, a
link is established by a single action (phone call, movie, or publication) and persists from
then on. Note that our model addresses the empirically relevant case where node-turnover rates
($\Delta n^{+}$, $\Delta n^{-}$) are significantly larger than the effective network growth
rate ($\Delta n^{+} - \Delta n^{-}$). For growing networks (without node deletion) it has been
shown that sub-linear preferential attachment ($\gamma < 1$) leads to degree distributions
with a power-law tail with an exponent proportional to $\gamma$ [11]. Our stationary model
exhibits exactly the opposite behavior. For r = 1 and large p, Fig. 2(a) shows that for large
$\gamma$ the tail parameter q in Fig. 2(b) approaches one, that is, an exponential
distribution. This suggests that the relationship between preferential attachment and the
shape of the degree distribution depends on the balance between node addition and removal
processes, i.e. whether the network is in a stationary or growing regime.